Data Mining with Extended Symbolic Models
Abstract
Symbolic modeling of data with decision trees or decision rules has a certain appeal to data mining application developers. The computationally efficient nature of the modeling methodology, and the inbuilt explanatory nature of the models that are generated, are two often cited reasons for the preferred use of these methods. Traditionally, the applications of these methods had been restricted to classification modeling. Recent extensions to these methods employing ideas from statistics and machine learning have resulted in more general frameworks that continue to exhibit the underlying characteristics but apply to a much wider class of applications. These extended symbolic modeling methodologies permit exciting new application avenues, including probabilistic modeling, text mining, and integrating data mining into knowledge-based frameworks. Highlights of work in this area in the data abstraction research group at IBM's T.J. Watson Research Center will be presented.
Similar resources
A New Algorithm for Optimization of Fuzzy Decision Tree in Data Mining
Decision-tree algorithms provide one of the most popular methodologies for symbolic knowledge acquisition. The resulting knowledge, a symbolic decision tree along with a simple inference mechanism, has been praised for comprehensibility. The most comprehensible decision trees have been designed for perfect symbolic data. Classical crisp decision trees (DT) are widely applied to classification t...
Gptips 2
GPTIPS is a free, open source MATLAB based software platform for symbolic data mining (SDM). It uses a ‘multigene’ variant of the biologically inspired machine learning method of genetic programming (MGGP) as the engine that drives the automatic model discovery process. Symbolic data mining is the process of extracting hidden, meaningful relationships from data in the form of symbolic equations...
Elite Bases Regression: A Real-time Algorithm for Symbolic Regression
Symbolic regression is an important but challenging research topic in data mining. It can detect the underlying mathematical models. Genetic programming (GP) is one of the most popular methods for symbolic regression. However, its convergence speed might be too slow for large scale problems with a large number of variables. This drawback has become a bottleneck in practical applications. In thi...
Far beyond the classical data models: symbolic data analysis
This paper introduces symbolic data analysis, explaining how it extends the classical data models to take into account more complete and complex information. Several examples motivate the approach, before the modeling of variables assuming new types of realizations are formally presented. Some methods for the (multivariate) analysis of symbolic data are presented and discussed. This is however ...
Evidence Sets: Modeling Subjective Categories
Zadeh’s Fuzzy Sets are extended with the Dempster-Shafer Theory of Evidence into a new mathematical structure called Evidence Sets, which can capture more efficiently all recognized forms of uncertainty in a formalism that explicitly models the subjective context dependencies of linguistic categories. A belief-based theory of Approximate Reasoning is proposed for these structures. Evidence sets...
Publication date: 1998